Everything about Self-organizing Map totally explained
A self-organizing map (SOM) is a type of artificial neural network that's trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. The map seeks to preserve the topological properties of the input space. This makes SOMs useful for visualizing low-dimensional views of high-dimensional data, akin to multidimensional scaling. The model was first described as an artificial neural network by the Finnish professor Teuvo Kohonen, and is sometimes called a Kohonen map.
Like most artificial neural networks, SOMs operate in two modes: training and mapping. Training builds the map using input examples. It is a competitive process, also called vector quantization. Mapping automatically classifies a new input vector.
Network structure
A self-organizing map consists of components called nodes or neurons. Associated with each node are a weight vector of the same dimension as the input data vectors and a position in the map space. The usual arrangement of nodes is a regular spacing in a hexagonal or rectangular grid. The self-organizing map describes a mapping from a higher-dimensional input space to a lower-dimensional map space. To place a vector from data space onto the map, find the node whose weight vector is closest to that vector and assign the map coordinates of this node to the vector.
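The placement procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `best_matching_node`, the dictionary layout, and the example weights are all assumptions made for the sketch, and closeness is measured with Euclidean distance.

```python
import math

def best_matching_node(weights, x):
    """Return the map coordinates of the node whose weight vector is closest to x.

    `weights` maps (row, col) map coordinates to weight vectors
    (a hypothetical layout chosen for this sketch).
    """
    return min(weights, key=lambda pos: math.dist(weights[pos], x))

# A hypothetical 2x2 rectangular map over 3-dimensional input data.
weights = {
    (0, 0): [0.0, 0.0, 0.0],
    (0, 1): [1.0, 0.0, 0.0],
    (1, 0): [0.0, 1.0, 0.0],
    (1, 1): [1.0, 1.0, 1.0],
}

# The input vector is assigned the map coordinates of its closest node.
position = best_matching_node(weights, [0.9, 0.1, 0.0])
```

Here the input lies nearest to the weight vector stored at map position (0, 1), so that position becomes its low-dimensional representation.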
While it's typical to consider this type of network structure as related to feedforward networks where the nodes are visualized as being attached, this type of architecture is fundamentally different in arrangement and motivation.
Useful extensions include using toroidal grids where opposite edges are connected and using large numbers of nodes. It has been shown that while self-organizing maps with a small number of nodes behave in a way that's similar to K-means, larger self-organizing maps rearrange data in a way that's fundamentally topological in character.
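To illustrate what connecting opposite edges means in practice, the sketch below compares lattice distances on a plain rectangular grid and on a toroidal one. The helper names `toroidal_offset` and `lattice_distance` are hypothetical, and a rectangular grid with Euclidean lattice distance is assumed.

```python
def toroidal_offset(a, b, size):
    """Shortest separation between indices a and b on a ring of `size` cells."""
    d = abs(a - b) % size
    return min(d, size - d)

def lattice_distance(p, q, rows, cols, toroidal=False):
    """Euclidean distance between map positions p and q on a rectangular grid.

    With toroidal=True, opposite edges are treated as adjacent, so the
    distance wraps around the grid boundaries.
    """
    if toroidal:
        dr = toroidal_offset(p[0], q[0], rows)
        dc = toroidal_offset(p[1], q[1], cols)
    else:
        dr = abs(p[0] - q[0])
        dc = abs(p[1] - q[1])
    return (dr * dr + dc * dc) ** 0.5

# On a 10x10 grid, the two ends of a row are far apart on a flat map
# but direct neighbors on a toroidal one.
flat = lattice_distance((0, 0), (0, 9), 10, 10)
wrapped = lattice_distance((0, 0), (0, 9), 10, 10, toroidal=True)
```

On a toroidal grid no node sits at an edge, so every node has the same number of lattice neighbors.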
It is also common to use the U-matrix. The U-matrix value of a particular node is the average distance between the node and its closest neighbors. In a rectangular grid for instance, we might consider the closest 4 or 8 nodes.
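A minimal sketch of the U-matrix computation follows, assuming a rectangular grid, the 4 closest lattice neighbors, and Euclidean distance between weight vectors. The function name `u_matrix` and the nested-list layout are choices made for this illustration.

```python
import math

def u_matrix(weights, rows, cols):
    """U-matrix for a rectangular grid using the 4 closest lattice neighbors.

    `weights[r][c]` is the weight vector of the node at row r, column c.
    Nodes on the border simply have fewer neighbors to average over.
    """
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:  # skip off-grid neighbors
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # A 1x1 map has no neighbors; report 0.0 in that case.
            u[r][c] = sum(dists) / len(dists) if dists else 0.0
    return u

# A hypothetical 1x3 map with one-dimensional weight vectors.
example = u_matrix([[[0.0], [1.0], [3.0]]], rows=1, cols=3)
```

High U-matrix values mark nodes whose weight vectors differ strongly from those of their neighbors, which is why the matrix is often displayed as an image to reveal boundaries between groups of similar nodes.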
Large SOMs display properties which are emergent. Therefore, large maps are preferable to smaller ones. In maps consisting of thousands of nodes, it's possible to perform cluster operations on the map itself.
Learning algorithm
The goal of learning in the self-organizing map is to cause different parts of the network to respond similarly to certain input patterns. This is partly motivated by how visual, auditory or other sensory information is handled in separate parts of the cerebral cortex in the human brain.
The weights of the neurons are initialized either to small random values or sampled evenly from the subspace spanned by the two largest principal component eigenvectors. With the latter alternative, learning is much faster because the initial weights already give a good approximation of the SOM weights.
The network must be fed a large number of example vectors that represent, as closely as possible, the kinds of vectors expected during mapping. The examples are usually presented several times.
The training utilizes competitive learning. When a training example is fed to the network, its Euclidean distance to all weight vectors is computed. The neuron whose weight vector is most similar to the input is called the best matching unit (BMU). The weights of the BMU and of neurons close to it in the SOM lattice are adjusted towards the input vector. The magnitude of the change decreases with time and with distance from the BMU. The update formula for a neuron with weight vector $W_v(t)$ is
$$W_v(t + 1) = W_v(t) + \Theta(v, t)\,\alpha(t)\,\bigl(D(t) - W_v(t)\bigr),$$
where $\alpha(t)$ is a monotonically decreasing learning coefficient and $D(t)$ is the input vector. The neighborhood function $\Theta(v, t)$ depends on the lattice distance between the BMU and neuron $v$. In the simplest form it is one for all neurons close enough to the BMU and zero for others, but a Gaussian function is a common choice, too. Regardless of the functional form, the neighborhood function shrinks with time.
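The update rule can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names `som_update` and `train_som`, the exponential decay schedules for the learning coefficient and the neighborhood width, the default parameter values, and the small random initialization range are all choices made for this sketch. The neighborhood function is the Gaussian variant on a rectangular lattice.

```python
import math
import random

def som_update(weights, d, alpha, sigma):
    """Apply one update W_v <- W_v + theta * alpha * (D - W_v) in place.

    `weights` maps (row, col) lattice positions to weight vectors,
    `d` is the input vector, `alpha` the learning coefficient and
    `sigma` the width of the Gaussian neighborhood. Returns the BMU position.
    """
    bmu = min(weights, key=lambda v: math.dist(weights[v], d))
    for v, w in weights.items():
        lattice_sq = (v[0] - bmu[0]) ** 2 + (v[1] - bmu[1]) ** 2
        theta = math.exp(-lattice_sq / (2.0 * sigma ** 2))  # Gaussian neighborhood
        for i in range(len(w)):
            w[i] += theta * alpha * (d[i] - w[i])
    return bmu

def train_som(data, rows, cols, n_steps, alpha0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on `data` (a list of equal-length vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed starting neighborhood width
    # Small random initial weights (the first initialization option in the text).
    weights = {(r, c): [rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(n_steps):
        decay = math.exp(-3.0 * t / n_steps)  # assumed schedule, decreasing in t
        som_update(weights, rng.choice(data), alpha0 * decay, sigma0 * decay)
    return weights

# Hypothetical toy data: the four corners of the unit square.
corners = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
trained = train_som(corners, rows=3, cols=3, n_steps=200)
```

Because both `alpha` and `sigma` shrink over time, early steps move large regions of the lattice together, while late steps fine-tune individual nodes near the BMU.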
Alternatives
Generative topographic maps (GTM) are a potential alternative to SOMs. In the sense that GTM explicitly requires a smooth and continuous mapping from the input space to the map space, it's topology preserving. However, in a practical sense, this measure of topological preservation is lacking.